Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker: use official client instead of fsouza/go-dockerclient #23966

Merged
merged 29 commits into from
Sep 26, 2024

Conversation

pkazmierczak
Copy link
Contributor

@pkazmierczak pkazmierczak commented Sep 16, 2024

This PR replaces fsouza/go-dockerclient 3rd party docker client library with
docker's official SDK.

Notes to reviewers:

  • this PR is split into 3 commits, one containing driver logic, one for the
    unit tests, and one for go mod/sum
  • the vast majority of the refactoring is pretty straightforward, replacing
    methods, changing argument orders and types. There are a few exceptions that
    require attention, namely:
    • any methods that utilize capturing stdio/stdout/stderr had to be re-written,
      e.g., dockerLogger.Start, driver.ExecTaskStreaming
    • signal handling in the task handler,
    • stats collector

@pkazmierczak pkazmierczak force-pushed the f-docker-official-client branch 2 times, most recently from 98d1578 to 7773315 Compare September 19, 2024 19:50
@pkazmierczak
Copy link
Contributor Author

@pkazmierczak pkazmierczak merged commit 981ca36 into main Sep 26, 2024
26 checks passed
@pkazmierczak pkazmierczak deleted the f-docker-official-client branch September 26, 2024 16:41
tgross added a commit that referenced this pull request Sep 27, 2024
In ##23966 when we switched to using the official Docker SDK client, we had more
contexts to add because most of the library methods take one. But for some APIs
like waiting for a container to exit after we've started it, we never want to
close this context, because the operation can outlive the Nomad agent itself.
tgross added a commit that referenced this pull request Sep 27, 2024
In ##23966 when we switched to using the official Docker SDK client, we had more
contexts to add because most of the library methods take one. But for some APIs
like waiting for a container to exit after we've started it, we never want to
close this context, because the operation can outlive the Nomad agent itself.
tgross added a commit that referenced this pull request Sep 27, 2024
In ##23966 when we switched to using the official Docker SDK client, we had more
contexts to add because most of the library methods take one. But for some APIs
like waiting for a container to exit after we've started it, we never want to
close this context, because the operation can outlive the Nomad agent itself.
tgross added a commit that referenced this pull request Sep 30, 2024
In ##23966 when we switched to using the official Docker SDK client, we had more
contexts to add because most of the library methods take one. But for some APIs
like waiting for a container to exit after we've started it, we never want to
close this context, because the operation can outlive the Nomad agent itself.
tgross added a commit that referenced this pull request Sep 30, 2024
In ##23966 when we switched to using the official Docker SDK client, we had to
rework the stats collection loop for the new client. But we missed resetting the
timer on the collection loop, which meant that we'd only collect stats once and
then never again.

* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=550814)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
tgross added a commit that referenced this pull request Oct 1, 2024
In ##23966 when we switched to using the official Docker SDK client, we had to
rework the stats collection loop for the new client. But we missed resetting the
timer on the collection loop, which meant that we'd only collect stats once and
then never again.

* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=550814)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
tgross added a commit that referenced this pull request Oct 1, 2024
In ##23966 when we switched to using the official Docker SDK client, this
included new API calls for attaching to the "exec objects" created for running
processes inside a running Docker task. When we updated the API for the
non-streaming cases (script health checks, and `change_mode = "script"`), we
used the container ID and not the exec object ID. These IDs aren't identical
because you can have multiple exec objects for a given container. This results
in errors like "unable to upgrade to tcp, received 404" because the Docker API
can't find the exec object with the container ID.

* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=551618)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
tgross added a commit that referenced this pull request Oct 1, 2024
In ##23966 when we switched to using the official Docker SDK client, this
included new API calls for attaching to the "exec objects" created for running
processes inside a running Docker task. When we updated the API for the
non-streaming cases (script health checks, and `change_mode = "script"`), we
used the container ID and not the exec object ID. These IDs aren't identical
because you can have multiple exec objects for a given container. This results
in errors like "unable to upgrade to tcp, received 404" because the Docker API
can't find the exec object with the container ID.

* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=551618)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants